-
-
Notifications
You must be signed in to change notification settings - Fork 289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new --no-use-system-time
flag to use a deterministic timestamp in built PEX
#722
Conversation
Much better and works accross Py27 and Py3!
Was accidently using the zip's overall filename to determine permissions, rather than the actual files themselves.
This reverts commit 829d12d.
--no-use-system-time
flag to use a deterministic timestamp in built PEXe--no-use-system-time
flag to use a deterministic timestamp in built PEX
@@ -302,6 +302,18 @@ def configure_clp_pex_options(parser): | |||
'likely will not be reproducible, meaning that if you were to run `./pex -o` with the ' | |||
'same inputs then the new pex would not be byte-for-byte identical to the original.') | |||
|
|||
group.add_option( | |||
'--use-system-time', '--no-use-system-time', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open to suggestions for a better name.
The main use case I want to optimize for is when deterministic time will be the default. For example, atm --use-deterministic-time
would probably be a good name, but once we make the switch then --no-use-deterministic-time
wouldn't be very descriptive for what the behavior would end up being.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In pantsbuild/pants#6094 we made a flag -jar-creation-time
which, if not set, defaulted to the current time. Not sure whether this is a good or bad default for pex, but it's some prior art :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@illicitonion : I think that we agreed on the other issue that we want this enabled by default, so the default would need to be stable.
@Eric-Arellano : This name looks good I think?
We use the result many times, and it is not expected to ever change between a pex run, so this is better set to be a property. We eagerly evaluate the constant for simplicity of implementation. Because the calculation is so cheap, it is not necessary to lazily evaluate.
If it's a constant, theen it becomes very difficult to test that $SOURCE_DATE_EPOCH works because constants get evaluated at import time, and at import time os.environ does not have the env var set. It will ignore the test setting the value, because the constant never gets re-evaluated.
We have no compelling use case for it and it complicates our code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! FTR: Daniel is out until our Tuesday night, so you might wait for a John review and then go ahead with landing.
RE: the SOURCE_DATE_EPOCH - this is just a flag in spirit that allows external tools like Pants to control the timestamp. Presumably Pants would pass a single value no matter what platform it was run on. As such I don't really follow the objection from Pants point of view except for the fact that:
All said though I'm happy enough to start with a fixed hardcoded value for simplicity sake. We could always add support later. |
We don't need to explain this here because readers of the code are unlikely to anticipate it being an issue, so are likely to get distracted more than anything. Instead, this PR description is updated to make the explanation if anyone is wondering "wait why didn't they do this?" Also make the MS-DOS explanation more concise.
Makes the code more consistent and simpler.
Thanks @jsirois for the thoughtful review. Great suggestions. I took them and added a function This allowed me to refactor |
We have no need for the more general datetime format yet. Don't make things more general until necessary!
My bad for not testing before pushing.
Problem
Part of #716 to make built PEXes be reproducible.
One of the major sources of unreproducibility is timestamps embedded into the zipfile. Python's
zipfile.py
uses the current system time when creating a new zipfile, per https://github.com/python/cpython/blob/45e92fc02d57a7219f4f4922179929f19b3d1c28/Lib/zipfile.py#L509. So, the Pex cannot be reproducible if built after the original run.Solution
Introduce a new flag
--use-system-time
/--no-use-system-time
that will toggle whether we use a deterministic timestamp or keep the default behavior of using the current system time. We will default to--use-system-time
for now to keep prior behavior, and then in a future release like 1.70 will switch over to default to reproducible builds.If
--no-use-system-time
is specified, we use midnight at January 1, 1980, which is the start of the MS-DOS time used by Zipfiles. We then use this value to construct aZipInfo
with the desired date.Not respecting
SOURCE_DATE_EPOCH
We do not respect the standard env var
$SOURCE_DATE_EPOCH
often used by tools for deterministic timestamps, as doing so risks reducing the reproducibility of built pexes across different platforms, e.g. if two platforms set the env var differently. Usually, this is supposed to be set for the sake of security to check that a shipped binary was not tampered with, but that is not our primary use case, so we do not respect it both for simplicity of our codebase and to avoid this potential reduction in reproducibility. Refer to https://reproducible-builds.org/docs/source-date-epoch/ for more information.Result
4 / 7 of the acceptance tests for reproducibility now pass consistently!
When setting CI to always use a deterministic timestamp, CI was green so things will work as expected when we change the default to deterministic timestamps.